Last month the R/Pharma conference took place on the Harvard campus in Boston. I presented our development project bioWARP, a large shiny application containing more than 500,000 lines of code. I noticed that apparently nobody else had ever come up with the idea of writing such a large shiny application, and I asked myself why. The main reason is that most people simply do not need to! I would now like to explain why we needed it and how we built it.
Imagine our large shiny app as a truck. It’s nearly a monster truck, but let’s stick with a truck. It’s a truck because our app has more than 500 interaction items. If we measure these as horsepower (HP), it is a regular truck with 500 HP. The shiny apps I see in my daily work have about 50 interaction items or even fewer, so these can be seen as cars. With fewer than 50, it’s a rather small car like a Mini Cooper. So why did my customers want to drive a truck?
Building software often starts with checking the user requirements. So when we started developing our statistical web application, we did that, too. After asking a lot of people inside our department, we noticed that the list of requirements was huge:
Main user requirements
Main application features
More requirements then came from all the analyses people need on a daily basis and wanted to have integrated into our app:
Mathematical features
Additionally, the whole application had to be written in R, as all our mathematical packages are written in R. So we decided to do it all with shiny, because it already covers 2 of the 3 main features.
Inside our department we were already running some large-scale desktop applications. When it came to testing, we always noticed that testing takes forever. If one single piece of software gathers data, calculates statistics, provides plot outputs and renders PDF reports, this truck is huge, and you can only test it by driving it a thousand kilometers and seeing if it still works. The idea we came up with was to build our truck out of Lego bricks. Each Lego brick can be tested to see if it is strong enough. If a Lego wheel runs, the truck will run. The wheel holder part is universal, so if we change the size of the wheels, we can still run the truck. This is called modularity. There are different solutions for this in R and shiny, which can be combined:
As Shiny modules did not exist when we started, we chose options 2 and 3.
As an example, I’ll compare two simple shiny apps representing two cars. One is written using object orientation, the other as a simple shiny application. The image below illustrates that the renderPlot function in a standard shiny app contains the plot itself, in this case a call to the hist function. So whenever you add a new plot, its function has to be called inside renderPlot.
In our object-oriented app, the plot output is rendered by calling the shinyElement method of a generic plot object we created and called AnyPlot. The first advantage is that the plot can easily be exchanged. Please look into the code if you wonder whether this really is so. To describe that advantage, imagine a normal car built of car parts. Our car is really a Lego car, using even smaller standardized parts (Lego bricks) to construct each part of the car. So instead of a grille made of one piece of steel, we constructed it from many little grey Lego bricks.
Going into the code of the two applications, you see a straightforward disadvantage: there is much more code in the object-oriented app. We have to define what a Lego brick is and what features it shall have.
# Object-oriented version of the app
library(methods)
library(rlang)
library(shiny)

# Generics that every plot object has to support
setGeneric("plotElement", where = parent.frame(), def = function(object) {
  standardGeneric("plotElement")
})
setGeneric("shinyElement", where = parent.frame(), def = function(object) {
  standardGeneric("shinyElement")
})

# Parent class holding an unevaluated plot call, and a histogram child class
setClass("AnyPlot", representation(plot_element = "call"))
setClass("HistPlot", representation(color = "character", obs = "numeric"), contains = "AnyPlot")

AnyPlot <- function(plot_element = expr(plot(1, 1))) {
  new("AnyPlot",
    plot_element = plot_element
  )
}

HistPlot <- function(color = "darkgrey", obs = 100) {
  new("HistPlot",
    plot_element = expr(hist(rnorm(!!obs), col = !!color, border = 'white')),
    color = color,
    obs = obs
  )
}

#' Method to plot a Plot element
setMethod("plotElement", signature = "AnyPlot", definition = function(object) {
  eval(object@plot_element)
})

#' Method to render a Plot Element
setMethod("shinyElement", signature = "AnyPlot", definition = function(object) {
  renderPlot(plotElement(object))
})

server <- function(input, output, session) {
  # Create a reactive to create the Report object
  report_obj <- reactive(HistPlot(obs = input$obs))
  # Check for change of the slider to change the plots
  observeEvent(input$obs, {
    output$renderedPDF <- renderText("")
    output$renderPlot <- shinyElement(report_obj())
  })
}

# Simple shiny UI containing a slider and the plot output
ui <- fluidPage(
  sidebarLayout(
    sidebarPanel(
      sliderInput(
        "obs",
        "Number of observations:", min = 10, max = 500, value = 100)
    ),
    mainPanel(
      plotOutput("renderPlot")
    )
  )
)

shinyApp(ui = ui, server = server)
# Standard version of the app
library(shiny)

server <- function(input, output) {
  # Output Gray Histogram
  output$distPlot <- renderPlot({
    hist(rnorm(input$obs), col = 'darkgray', border = 'white')
  })
}

# Simple shiny UI containing a slider and the standard histogram
ui <- fluidPage(
  sidebarLayout(
    sidebarPanel(
      sliderInput(
        "obs",
        "Number of observations:", min = 10, max = 500, value = 100)
    ),
    mainPanel(
      plotOutput("distPlot")
    )
  )
)

shinyApp(ui = ui, server = server)
Another advantage of object orientation is that you can now output the plot in many different formats. We solved this by introducing methods called pdfElement, logElement or archiveElement. For a deeper look, you can check out some examples stored on GitHub. These show the differences between object-oriented and standard shiny apps. You can see that duplicated code is reduced in the object-oriented apps. Additionally, the code of the shiny app itself does not change for object-oriented apps; only the code constructing the objects shown on the page changes. For the standard apps, the shiny code itself also changes every time an element is updated.
The main advantage of this approach is that you can keep your shiny app exactly the same, whatever it calculates or reports. Inside our department this meant that whenever somebody updated an R package that builds plots, we did not have to touch our main app again. Whenever somebody wanted to change just the linear regression app, we did not have to touch the other apps. The look and feel, the logging and the PDF report stay exactly the same. These 3 functionalities never have to be touched unless they themselves need an update.
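As a rough sketch of how such an additional output method could look, the snippet below adds a `pdfElement` generic next to `plotElement`. The method name comes from the text above, but its signature, the `file` argument and the use of base `bquote()` in place of `rlang::expr()` are assumptions made for illustration, not the actual bioWARP implementation.

```r
library(methods)

# Minimal stand-ins for the classes of the object-oriented app
setGeneric("plotElement", function(object) standardGeneric("plotElement"))
setGeneric("pdfElement", function(object, file) standardGeneric("pdfElement"))

setClass("AnyPlot", representation(plot_element = "call"))
setClass("HistPlot", representation(color = "character", obs = "numeric"),
         contains = "AnyPlot")

HistPlot <- function(color = "darkgrey", obs = 100) {
  new("HistPlot",
      # bquote() splices the values into the stored call, similar to expr(!!obs)
      plot_element = bquote(hist(rnorm(.(obs)), col = .(color), border = "white")),
      color = color,
      obs = obs)
}

setMethod("plotElement", "AnyPlot", function(object) {
  eval(object@plot_element)
})

# Assumed signature: draw the stored plot into a PDF file and return its path
setMethod("pdfElement", "AnyPlot", function(object, file) {
  grDevices::pdf(file)
  on.exit(grDevices::dev.off())  # close the device even if plotting fails
  plotElement(object)
  invisible(file)
})

# HistPlot inherits the method from AnyPlot without any extra code
pdf_path <- pdfElement(HistPlot(obs = 50), tempfile(fileext = ".pdf"))
file.exists(pdf_path)
```

Because the method is defined on the parent class, every child plot class gets the PDF output for free, which is the point of the approach described above.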
During development we noticed that we did not have just one single app to build; we would have many. So we decided to construct a separate R package for each app. This meant we had to define, in a core-package, one class that defines what an app looks like. Each contributor to our shiny app then builds a package that contains a child of our main class. We called this class Module, so we got a lot of Module-packages. This is not a shiny module, but it’s modular. Our app now allows bringing together a lot of those modules, making it bigger and bigger and bigger. Yeah, we have a truck! Made of Lego!
The modularization and packaging now enable fast testing. Why? Each package can be tested using basic testthat functionality. So first we tested our main application package, which allows adding building blocks. Afterwards we tested each single package on its own. Finally, the whole application is tested. Our truck is ready to roll. Upon updates, we do not have to test the whole truck again. If we want to have larger tires, we just update the tire package, but not the core-package or other packages.
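A minimal sketch of this parent/child setup is shown below. Only the class name `Module` comes from the text; the slots, the `moduleTitle` generic and the `BoxPlotModule` child are hypothetical names chosen for illustration. In the real setup the parent would live in the core-package and each child in its own package.

```r
library(methods)

# --- would live in the core-package (assumed layout) ---
# Virtual parent: it cannot be instantiated itself, only its children can
setClass("Module",
         representation("VIRTUAL", title = "character", tested = "logical"))

# Hypothetical generic the main app can call on every module
setGeneric("moduleTitle", function(object) standardGeneric("moduleTitle"))
setMethod("moduleTitle", "Module", function(object) {
  status <- if (object@tested) "tested" else "untested"
  paste0(object@title, " (", status, ")")
})

# --- would live in a separate module package (hypothetical) ---
setClass("BoxPlotModule", contains = "Module")
BoxPlotModule <- function() {
  new("BoxPlotModule", title = "Great BoxPlot Module", tested = FALSE)
}

# The main app only relies on the interface of the parent class
modules <- list(BoxPlotModule())
titles <- vapply(modules, moduleTitle, character(1))
print(titles)
```

Adding another module then means adding another child class to the list; the code that loops over `modules` stays untouched.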
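To illustrate what testing a single brick can look like, here is a minimal testthat sketch for a plot object like the one from the example app. The test descriptions and expectations are assumptions for illustration, and base `bquote()` is used in place of `rlang::expr()` only to keep the snippet light on dependencies.

```r
library(methods)
library(testthat)

# Brick under test: a compact copy of the HistPlot class from the example app
setClass("AnyPlot", representation(plot_element = "call"))
setClass("HistPlot", representation(color = "character", obs = "numeric"),
         contains = "AnyPlot")
HistPlot <- function(color = "darkgrey", obs = 100) {
  new("HistPlot",
      plot_element = bquote(hist(rnorm(.(obs)), col = .(color), border = "white")),
      color = color, obs = obs)
}

test_that("HistPlot stores its settings and is an AnyPlot", {
  p <- HistPlot(color = "blue", obs = 25)
  expect_s4_class(p, "AnyPlot")
  expect_equal(p@obs, 25)
  expect_equal(p@color, "blue")
})

test_that("the stored call draws a histogram of the requested size", {
  p <- HistPlot(obs = 25)
  grDevices::pdf(NULL)       # null device: nothing is written to disk
  h <- eval(p@plot_element)  # hist() returns the histogram object
  grDevices::dev.off()
  expect_s3_class(h, "histogram")
  expect_equal(sum(h$counts), 25)
})
```

Tests like these run without starting the shiny app at all, which is what makes testing package by package fast.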
The truck is made of bricks, actually the same bricks we used to build the car. Just many more of them. Now the hard part is still putting them all together.
We are dealing with the many different Modules that we were writing. Each module comes in one package. The main issue was that we wanted all apps to be deeply tested. During development, of course, not all apps were tested right away, so we had to give them a tag. Additionally, some apps required help pages, others didn’t. Some apps came with example data sets, which were stored in an additional data sets folder, and some didn’t. Some apps already had a nice title in them; for others it should be easy to configure. For each Module we also have to source js and css files, which we allowed to be added individually for each app. The folder from which to source them is chosen by the app author. We wanted to provide as much flexibility as possible while keeping our standards for Lego bricks (look and feel, logging, plotting and reporting).
We came up with the idea of XML config files. The XML file contains all the information needed to tell what has to be set for each Module. An example XML is given below; you can see it as the Lego manual. These small configurations allow managing the apps. We also built an XML that allows the apps to use features of what we call the core-package. This XML file is rather difficult to set up. But imagine that it tells which plot shall be logged, which input shall be used and which plots shall go into the PDF report. It allows fast development while sticking to standards. A whole example can be found on GitHub.
<module id="module1" type="default" datasets="yes" tested="no">
  <package> modulepackage1 </package>
  <class> modulepackage1_Module </class>
  <title> Great BoxPlot Module </title>
  <short> GBM </short>
  <path source="modulepackage1"> . </path>
  <help>
    <level0>help/index.html</level0>
    <level1>
      <item name="details">help/about.html</item>
    </level1>
  </help>
  <data>
    <ds name="Two Groups" file="datasets/two_groups.csv"/>
  </data>
</module>
Inside the config file you can clearly see that the title of the app and the locations of the help pages and example data sets are now given. Even the name of the class that describes the Module is given. This allows us to rapidly add modules to our main app environment.
In the end our truck is made of many parts that all increase its power and strength. As we now have around 16 modules in our app and each has between 20 and 50 inputs, the truck has about 500 inputs. They all look similar and can be used to produce standardized PDF reports. The truck can even become a monster truck and, thanks to the config files, will still be easy to manage.
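The snippet below sketches how such a config entry could be read and turned into a Module object. It assumes the xml2 package for parsing; a stand-in class is defined in place of the real `modulepackage1`, and the helper `read_module_config` is a hypothetical name, not part of the actual app.

```r
library(methods)
library(xml2)

# Stand-in for the class the real modulepackage1 would provide (assumption)
setClass("modulepackage1_Module", representation(title = "character"))

# Shortened version of the config entry, given inline for the example
config <- '<module id="module1" type="default" datasets="yes" tested="no">
  <package> modulepackage1 </package>
  <class> modulepackage1_Module </class>
  <title> Great BoxPlot Module </title>
  <data>
    <ds name="Two Groups" file="datasets/two_groups.csv"/>
  </data>
</module>'

# Hypothetical helper: collect the settings of one <module> node in a list
read_module_config <- function(node) {
  field <- function(tag) xml_text(xml_find_first(node, tag), trim = TRUE)
  list(
    id       = xml_attr(node, "id"),
    tested   = identical(xml_attr(node, "tested"), "yes"),
    package  = field("package"),
    class    = field("class"),
    title    = field("title"),
    datasets = xml_attr(xml_find_all(node, "data/ds"), "file")
  )
}

cfg <- read_module_config(read_xml(config))

# Instantiate the module from the class name given in the config
module <- new(cfg$class, title = cfg$title)
str(cfg)
```

Since the class name is read from the config as a string, the main app does not need to know any module class in advance; a real loader would additionally have to load the package named in `<package>` first.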